The NASM in ~/proj/nasmrc is 03490692b0082fe16a1936a5774f4326a55075e6
The one in ~/proj/nasm is 7a5502142b735fe62866963fa0bf3182808996b2
The other two revisions are in directories named after their commit IDs.

It appears that the fix to https://bugzilla.nasm.us/show_bug.cgi?id=3392949 introduced this bug, in the commit https://github.com/netwide-assembler/nasm/commit/21c977e717d7ecea03275810dcc11c082d4f20f0

In the tests, both uses of a segment value with an adjustment are encoded as 0000h in the object file's LEDATA16 records, where we expect 0010h for the one and FFC6h for the other.

test$ cat test.asm
DOSENTRYADJUSTSEGMENT equ 60h - 26h
DOSENTRYADJUSTOFFSET equ DOSENTRYADJUSTSEGMENT * 16
DOSENTRYDEVICEBASE equ 10h
%idefine PTR
section DOSENTRY
dw AUXDEV2 - DOSENTRYDEVICEBASE * 16, seg AUXDEV2 + DOSENTRYDEVICEBASE
AUXDEV2:
ENTRYPOINT:
MOV WORD PTR [ENTRYPOINT+3], DOSENTRY - DOSENTRYADJUSTSEGMENT
test$ ~/proj/nasmrc/nasm test.asm -fobj -o testrc.obj
test$ ~/proj/nasm/nasm test.asm -fobj -o testnew.obj
test$ ~/proj/omfdump/omfdump testrc.obj > testrc.txt
test$ ~/proj/omfdump/omfdump testnew.obj > testnew.txt
test$ diff -u testrc.txt testnew.txt
--- testrc.txt 2025-09-03 16:27:09.961394125 +0200
+++ testnew.txt 2025-09-03 16:27:15.453514907 +0200
@@ -1,10 +1,9 @@
 80 THEADR 10 bytes, checksum 3F (valid)
  0000: 08 74 65 73 74 2e 61 73-6d : .test.asm
-88 COMENT 36 bytes, checksum E7 (valid)
+88 COMENT 33 bytes, checksum 7E (valid)
  [NP=0 NL=0 UD=00] 00 Translator
- 0002: 20 54 68 65 20 4e 65 74-77 69 64 65 20 41 73 73 : The Netwide Ass
- 0012: 65 6d 62 6c 65 72 20 32-2e 31 36 2e 30 32 72 63 : embler 2.16.02rc
- 0022: 32 : 2
+ 0002: 1d 54 68 65 20 4e 65 74-77 69 64 65 20 41 73 73 : .The Netwide Ass
+ 0012: 65 6d 62 6c 65 72 20 32-2e 31 37 72 63 30 : embler 2.17rc0
 96 LNAMES 11 bytes, checksum DF (valid)
  [0001] ''
  0000: 00 08 44 4f 53 45 4e 54-52 59 : .
@@ -17,9 +16,9 @@
 88 COMENT 4 bytes, checksum 91 (valid)
  [NP=0 NL=1 UD=00] A2 Link pass separator
  0002: 01 : .
-a0 LEDATA16 14 bytes, checksum A5 (valid)
+a0 LEDATA16 14 bytes, checksum 7A (valid)
  segment 'DOSENTRY', offset 0000
- 0000: 04 ff 10 00 c7 06 07 00-c6 ff : ..........
+ 0000: 04 ff 00 00 c7 06 07 00-00 00 : ..........
 9c FIXUPP16 17 bytes, checksum D7 (valid)
  FIXUP segment-relative, type 1 (16-bit offset)
  record offset 0000
test$ ~/proj/nasmtest/21c977e717d7ecea03275810dcc11c082d4f20f0/nasm test.asm -fobj -o test21c9.obj
test$ ~/proj/nasmtest/7d5e549d6385ffef050d830317d7663e88d2986e/nasm test.asm -fobj -o test7d5e.obj
test$ ~/proj/omfdump/omfdump test21c9.obj > test21c9.txt
test$ ~/proj/omfdump/omfdump test7d5e.obj > test7d5e.txt
test$ diff -u testrc.txt test7d5e.txt
--- testrc.txt 2025-09-03 16:27:09.961394125 +0200
+++ test7d5e.txt 2025-09-03 16:29:45.500812211 +0200
@@ -1,10 +1,10 @@
 80 THEADR 10 bytes, checksum 3F (valid)
  0000: 08 74 65 73 74 2e 61 73-6d : .test.asm
-88 COMENT 36 bytes, checksum E7 (valid)
+88 COMENT 36 bytes, checksum E5 (valid)
  [NP=0 NL=0 UD=00] 00 Translator
  0002: 20 54 68 65 20 4e 65 74-77 69 64 65 20 41 73 73 : The Netwide Ass
  0012: 65 6d 62 6c 65 72 20 32-2e 31 36 2e 30 32 72 63 : embler 2.16.02rc
- 0022: 32 : 2
+ 0022: 34 : 4
 96 LNAMES 11 bytes, checksum DF (valid)
  [0001] ''
  0000: 00 08 44 4f 53 45 4e 54-52 59 : .
test$ diff -u testrc.txt test21c9.txt
--- testrc.txt 2025-09-03 16:27:09.961394125 +0200
+++ test21c9.txt 2025-09-03 16:29:35.712597242 +0200
@@ -1,10 +1,10 @@
 80 THEADR 10 bytes, checksum 3F (valid)
  0000: 08 74 65 73 74 2e 61 73-6d : .test.asm
-88 COMENT 36 bytes, checksum E7 (valid)
+88 COMENT 36 bytes, checksum E5 (valid)
  [NP=0 NL=0 UD=00] 00 Translator
  0002: 20 54 68 65 20 4e 65 74-77 69 64 65 20 41 73 73 : The Netwide Ass
  0012: 65 6d 62 6c 65 72 20 32-2e 31 36 2e 30 32 72 63 : embler 2.16.02rc
- 0022: 32 : 2
+ 0022: 34 : 4
 96 LNAMES 11 bytes, checksum DF (valid)
  [0001] ''
  0000: 00 08 44 4f 53 45 4e 54-52 59 : .
@@ -17,9 +17,9 @@
 88 COMENT 4 bytes, checksum 91 (valid)
  [NP=0 NL=1 UD=00] A2 Link pass separator
  0002: 01 : .
-a0 LEDATA16 14 bytes, checksum A5 (valid)
+a0 LEDATA16 14 bytes, checksum 7A (valid)
  segment 'DOSENTRY', offset 0000
- 0000: 04 ff 10 00 c7 06 07 00-c6 ff : ..........
+ 0000: 04 ff 00 00 c7 06 07 00-00 00 : ..........
 9c FIXUPP16 17 bytes, checksum D7 (valid)
  FIXUP segment-relative, type 1 (16-bit offset)
  record offset 0000
test$
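For reference, the ten data bytes of the good LEDATA16 record can be recomputed by hand from test.asm. The following is only a sketch in Python, not anything NASM does internally. It assumes that AUXDEV2 and ENTRYPOINT both sit at offset 4 (directly after the two dw words) and that the segment fixups leave just the constant adjustment in the data bytes. The helper name word() is made up for illustration.

```python
import struct

# Constants copied from test.asm.
DOSENTRYADJUSTSEGMENT = 0x60 - 0x26
DOSENTRYDEVICEBASE = 0x10

# Assumption: both labels follow the two dw words, so they are at offset 4.
AUXDEV2 = 4
ENTRYPOINT = AUXDEV2


def word(value):
    """Hypothetical helper: encode a value as a little-endian 16-bit word."""
    return struct.pack("<H", value & 0xFFFF)


expected = (
    word(AUXDEV2 - DOSENTRYDEVICEBASE * 16)   # dw offset part
    + word(DOSENTRYDEVICEBASE)                # seg AUXDEV2 + adjustment
    + bytes([0xC7, 0x06])                     # MOV WORD [disp16], imm16
    + word(ENTRYPOINT + 3)                    # disp16
    + word(-DOSENTRYADJUSTSEGMENT)            # DOSENTRY - adjustment
)

print(expected.hex(" "))
```

The two words carrying segment adjustments are the third and the last one; those are exactly the ones that come out as 00 00 in the bad builds.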
omfdump is from https://github.com/boeckmann/omfdump/
Ah, nice... I may want to "steal" that one for NASM as well.
Created attachment 411943 [details] Possible solution patch
I have attached a patch which I think might be able to resolve this problem. Do you think you could test it out?
I was about to push a new revision of lDOS built using the 2025 August git NASM. Luckily, I compared all four files (instsect.com, format.exe, share.exe, msbiow.exe) using my ident86 tool. The differences in msbiow.exe turned out to include this bug's traces of zeroes rather than the adjustments for segment references. The new kernel build also, it turns out, failed to boot.

The comparison report was created using this command:

~/proj/ident86/ident86.py -s aug/msbiow.exe sep/msbiow.exe sep/msbio.tls sep/msbiow.map | tee msbio.txt

Result uploaded to https://pushbx.org/ecm/test/20250903/ident/msbio.txt
(In reply to H. Peter Anvin from comment #4)
> I have attached a patch which I think might be able to resolve this problem.
> Do you think you could test it out?

Yes, I tested it. I didn't build the kernel files yet but the test1.exe from https://bugzilla.nasm.us/show_bug.cgi?id=3392949 and the test.asm from this report appear to both work with that.
I have checked in this fix. I'm leaving this as PENDING/FIXED for now; I will close it formally after you verify it solves your problem.
(In reply to H. Peter Anvin from comment #7)
> I have checked in this fix.
>
> I'm leaving this as PENDING/FIXED for now; I will close it formally after
> you verify it solves your problem.

The ident86 report on the msbiow.exe file is as follows (minus verbose details):

ident86 version: hg 6c01457081ae
lDebug version: "lDebug (2025-03-09)"
Number of files: 4
File 1: [...]/build-dl-wwwecm/msdos4/src/BIOS/msbiow.exe
File 2: src/BIOS/msbiow.exe
Trace listing file: src/BIOS/msbio.tls
WarpLink map file: src/BIOS/msbiow.map
Not merged map ranges:
Merged map ranges:
MZ executable header detected, size = 512 bytes
EOF1 reached at 80592 bytes
EOF2 reached at 80592 bytes
Files are the same length (80592 bytes)
Amount different bytes: 55
Amount different lines: 0
Amount not different ranges: 39

The kernel also boots. I compared instsect.com, format.exe, and share.exe as well and they all seem to be identicalised (only "no difference" ranges).
I assume that NASM changed its encoding choices "fingerprint" by accident. It does present challenges in comparing files.
This, unfortunately, is often a result of internal changes as new formats are supported. However, if there are specific things that are giving you a headache, at least let me know so I can see if it is (a) not a bug and (b) trivially addressable...
(In reply to H. Peter Anvin from comment #10)
> This, unfortunately, is often a result of internal changes as new formats
> are supported. However, if there are specific things that are giving you a
> headache, at least let me know so I can see if it is (a) not a bug and (b)
> trivially addressable...

I don't think there are any bugs, because ident86 detected all of them as "no difference". (This means same instruction length + same semantic meaning.)

I redid the identicalisation setup that I used to detect this bug, but with your patch applied this time. If you want, you can check every listed byte change to figure out what instruction it is a part of. (I may add an option to ident86 soon to help with that.) The instsect.com file doesn't have a corresponding .tls file yet because it is an -f bin output file, but it seems ident86 happened to match all instruction boundaries correctly anyway.

The compared binaries, listing files, map files, sorted section files, and full ident86 reports (including verbose details) are found in https://pushbx.org/ecm/test/20250903/ident.new/

To find the instructions corresponding to an ident86 file offset, you have to subtract the MZ header size (200h) then look up the address in the .srt file. Then determine the base of the named section using the .tls file's announcements, like in << === Switch to base=002450h -> "DOSCODECODE" >>, subtract that base from your address, and find the result as the .tls machine code dump offset.

To filter out the desired section from the .tls file for easier search, you could use something like the following command:

~/proj/tractest/listvars.pl msbiow.map msbio.tls --filter-section=SYSINITGROUP

The .txt and .srt files were generated using commands like these:

~/proj/ident86/ident86.py -s aug/msbiow.exe sep/msbiow.exe sep/msbio.tls sep/msbiow.map | tee msbio.txt
~/proj/tractest/sortmap.pl msbiow.map --skip-empty --list-align > msbiow.srt
Added a -D option to ident86: https://hg.pushbx.org/ecm/ident86/rev/a33e4f6e652d

This is to make it dump the mismatching instructions even if they are semantically the same, in the .log files eg https://pushbx.org/ecm/test/20250903/ident.new/msbio.log

Example:

004B69 first:C3 != second:D8
004B71 first:C3 != second:D8
004B68 up to below 004B81, first=004B69 last=004B71
first: 004B68 +2 xchg al, bl second: samesame
first: 004B6A +1 inc di second: samesame
first: 004B6B +3 mov cx, 000A second: samesame
first: 004B6E +2 rep movsw second: samesame
first: 004B70 +2 xchg al, bl second: samesame

The first:/second: lines list the numeric mismatches, while the side-by-side disassembly lists what instructions they are a part of. Both use file offsets as addresses. The plus numbers give the length of each instruction. In this case, NASM encodes xchg al, bl differently from the way it used to.

In instsect.log you can see that it doesn't know where exactly to start disassembly, so you get 16 bytes before the different bytes. And it sometimes guesses instruction boundaries wrong, albeit this is not a major problem here. This is due to the lack of a .tls file.
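To illustrate why the C3 versus D8 mismatch is "samesame": opcode 86h takes a ModRM byte, and the two bytes differ only in which register lands in the reg field and which in the r/m field. A rough Python sketch of that decoding follows. The function names are made up here and this is not how ident86 or NDISASM does it.

```python
# 8-bit register names in x86 encoding order.
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def decode_modrm(modrm):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return modrm >> 6, (modrm >> 3) & 7, modrm & 7


def xchg8_operands(modrm):
    """Return (r/m operand, reg operand) for a register-direct 86h /r."""
    mod, reg, rm = decode_modrm(modrm)
    if mod != 3:
        raise ValueError("not a register-direct ModRM byte")
    return REG8[rm], REG8[reg]


first = xchg8_operands(0xC3)    # byte from the first file
second = xchg8_operands(0xD8)   # byte from the second file
print(first, second)

# XCHG is symmetric, so only the unordered pair of registers matters.
print(set(first) == set(second))
```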
As an example, to find the .tls trace listing position that corresponds to file offset 004B68h, you:

- Subtract the 200h for the MZ header, yielding 4968h.
- Look up 4968h in the sorted .map sections file, https://pushbx.org/ecm/test/20250903/ident.new/sep/msbiow.srt
  This is the match:
  DOSCODECODE DOSCODEGROUP s=04903h l=01D1h a=1 dos/search
- Look up DOSCODECODE section base in the .tls file:
  === Switch to base=002450h -> "DOSCODECODE"
- Subtract the base from 4968h, resulting in 2518h.
- Search for that number as the 8-hexit start offset of a machine code dump, finding https://hg.pushbx.org/ecm/tlsfiles/file/9abe4783ff05/msbio.tls#l79917
  0 00002518 86C3 XCHG AL,BL ; Search byte to BL, user byte to AL
  0 0000251A 47 INC DI
  116 ; STOSB ; Store the correct "user" drive byte
  117 ; at the start of the search info
  0 0000251B B90A00 MOV CX,20/2
  0 0000251E F3A5 REP MOVSW ; Rest of search cont info, SI -> entry
  0 00002520 86C3 XCHG AL,BL ; User drive byte back to BL, search
  121 ; byte to AL
  0 00002522 AA STOSB ; Search contin drive byte at end of
  123 ; contin info
- Locate source text corresponding to this listing: https://hg.pushbx.org/ecm/msdos4/file/637cfcc5a4d1/src/DOS/search.nas#l114
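The arithmetic of the steps above can be sketched in a few lines of Python. This is only an illustration with made-up names, not part of ident86 or the tractest scripts. It assumes that s= and l= in the .srt line are the start and length of the matching range, and the lookups in the .srt and .tls files themselves are still done by hand.

```python
MZ_HEADER_SIZE = 0x200  # size of the MZ header, as given in the ident86 report


def tls_dump_offset(file_offset, range_start, range_length, section_base):
    """Map an ident86 file offset to a .tls machine code dump offset.

    range_start and range_length are assumed to be the s= and l= values of
    the matching .srt line; section_base comes from the .tls file's
    "Switch to base=" announcement for that section.
    """
    image_address = file_offset - MZ_HEADER_SIZE
    if not range_start <= image_address < range_start + range_length:
        raise ValueError("address is outside the given .srt range")
    return image_address - section_base


offset = tls_dump_offset(0x4B68, 0x4903, 0x1D1, 0x2450)
print(f"{offset:08X}")
```

The printed value is what to search for as the 8-hexit start offset in the .tls file.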
Here's the msbio ident86 report with -D -j -J. This dumps the relevant listing file lines and uses escape sequences to highlight the different bytes in the hexdump. I'd forgotten that the j options could do this. File at https://pushbx.org/ecm/test/20250903/ident.new/msbio.ext
It seems all differences are xchg reg,reg instructions, in share, format, and msbio. Probably in instsect too but due to lacking a .tls file the -j -J options to ident86 cannot work.
Retried with new ident86 options -D -Y. The instsect changes are all xchg reg,reg as well.
Both NASM and NDISASM swap xchg reg,reg operands (where neither reg is ax).

test$ cat test.asm
xchg bx, dx
xchg cx, dx
test$ nasm test.asm -l /dev/stderr
1
2 00000000 87DA xchg bx, dx
3 00000002 87CA xchg cx, dx
test$ ndisasm test
00000000 87DA xchg bx,dx
00000002 87CA xchg cx,dx
test$ ~/proj/nasmtest/patch/nasm test.asm -l /dev/stderr
1
2 00000000 87D3 xchg bx, dx
3 00000002 87D1 xchg cx, dx
test$ ~/proj/nasmtest/patch/ndisasm test
00000000 87D3 xchg bx,dx
00000002 87D1 xchg cx,dx
test$ ndisasm test
00000000 87D3 xchg dx,bx
00000002 87D1 xchg dx,cx
test$
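The two sets of encodings in this transcript differ only in that the reg and r/m fields of the ModRM byte trade places. A small Python sketch of that relationship, with a made-up function name and no claim about how either NASM build picks its encoding:

```python
def swap_modrm_registers(modrm):
    """Swap the reg and rm fields of a register-direct ModRM byte."""
    if modrm >> 6 != 3:
        raise ValueError("not a register-direct ModRM byte")
    reg = (modrm >> 3) & 7
    rm = modrm & 7
    return 0xC0 | (rm << 3) | reg


# ModRM bytes from the first nasm run above, mapped to their counterparts.
for modrm in (0xDA, 0xCA):
    print(f"87{modrm:02X} <-> 87{swap_modrm_registers(modrm):02X}")
```

Applying the function twice gives back the original byte, which fits the observation that either form disassembles to the same exchange with the operands listed in the other order.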
For a patch fixing the fingerprints refer to https://bugzilla.nasm.us/show_bug.cgi?id=3392951